Abstract: In many real-world applications, information about an object can be obtained from multiple sources. The sources may provide different points of view depending on their origin. As a consequence, conflicting pieces of information are inevitable, which gives rise to a crucial problem: how to find the truth among these conflicts. Many truth-finding methods have been proposed to resolve conflicts based on information trustworthiness (i.e., information that appears more often is considered more trustworthy) as well as source reliability. However, the factor of human involvement, i.e., that information may be falsified by people with malicious intent, is largely ignored in existing methods. How to jointly account for the origin of information and human participation has not yet been studied. To address this challenge, we propose a method, Collaborating Information against Unreliable Views (CIUV), that takes human involvement into account when finding the truth. CIUV consists of 3 stages that interactively mitigate the impact of unreliable views and calculate the truth by weighting possible biases between sources. We theoretically analyze the error bound of CIUV and conduct extensive experiments on a real dataset for evaluation. The experimental results show that CIUV is feasible and has the smallest error among the compared methods.


I. Introduction 

One significant challenge of Big Data is its wide variety, i.e., one can find information about an object from various sources. These widely distributed sources may provide different points of view depending on their origin or bias. To find the truth among these views, one must determine where the conflicts come from and which answers are more reliable than others. To deal with the challenge of variety, techniques have been proposed to assess the reliability of sources and to derive the truth from the information they provide. A common approach in these techniques is to average over the sources, where the sources either vote for each other to assign weights or are weighted by a third party.

Although many truth-finding techniques have been proposed, the factor of human involvement has largely been overlooked by researchers. For instance, someone may provide biased views with malicious intent so as to prevent others from finding the truth, or people may simply not be incentivized to improve the reliability of the data they provide. How to account for these possibilities jointly has not yet been studied. Meanwhile, examining the factor of human involvement in truth-finding from multiple sources has great application value. A typical application scenario is Gross Domestic Product (GDP) accounting. There is a constant and ongoing debate about the accuracy of GDP statistics. People argue that GDP is overestimated or underestimated based on their biased observations, for example that companies deliberately record lower transaction amounts on their books to pay less tax, or exaggerate transaction amounts to obtain more loans. With this doubt in mind, some people have started to assess the accuracy of GDP statistics using data obtained from other, more reliable sources, e.g., using railway cargo volume, electricity consumption, and loans disbursed by banks as indicators of GDP. Data provided by such reliable sources are thought to be less manipulated by people, and people usually have more confidence in their accuracy. As can be seen, the key to GDP accuracy lies in the reliability of sources. In many cases, malicious intent or negligence makes a source unreliable.

In this paper, we address the challenge of finding the truth while taking human involvement into account. In particular, we are interested in dealing with unreliable views that are provided by people with malicious intent. In summary, we make the following contributions in this paper:

• We model the problem of finding the truth under human involvement, and propose the method of Collaborating Information against Unreliable Views (CIUV), which is divided into 3 stages for accuracy improvement.

• CIUV interactively mitigates the impact of unreliable views, and calculates the truth by considering both the mean and the variance of the error of views. We theoretically analyze the error bound of CIUV.

• In the experiments, we evaluate the performance of CIUV on a real dataset under various settings. The experimental results show that CIUV is feasible and has the smallest error among the compared methods.


The remainder of this paper is organized as follows. We introduce the preliminary notations and their definitions in Section II. We give an overview of the workflow of CIUV in Section III. We detail the 3 stages of the CIUV implementation in Section IV. We analyze the error bound of CIUV in Section V. We provide the experimental results in Section VI. We review related work in Section VII. Finally, we conclude our paper in Section VIII.


II. Preliminaries 

We first introduce notations and their definitions, and then 
specify the objective in this section. 

Suppose that a total of $m$ people separately provide information about an object (e.g., statistical data on GDP). Let $V = \{v_i \mid i = 1, 2, \ldots, m\}$ denote the set of views carried by this information. These views may be in different representations. For instance, one person may give data on railway cargo volume while another contributes data on electricity consumption over the last several years. Both types of data can be used as indicators of GDP. We map the views of different representations into a unified one. Let $U$ be the possible view space of the unified representation, and $M$ be the collection of mapping functions. For any $v \in V$, there exists a corresponding $u \in U$, and we can construct a mapping function $m \in M$ that maps $v$ to $u$. The formal definition of a mapping function is as follows.

Definition 1. A mapping function $m \in M : \{v\} \to \{u\}$ converts the view $v \in V$ provided by a person to the view $u \in U$ of the unified representation.
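As an illustration of Definition 1, the following minimal sketch maps raw yearly series in different units into one unified representation. It assumes that the unified representation is the year-over-year growth rate in percent, which matches how the experiments later treat views but is not stated as the definition of the mapping; the function name `to_growth_rate` and all numbers are hypothetical.

```python
from typing import List, Sequence


def to_growth_rate(levels: Sequence[float]) -> List[float]:
    """Map a yearly level series to year-over-year growth rates in percent.

    This plays the role of one mapping function m in M (an assumption).
    """
    rates = []
    for prev, curr in zip(levels, levels[1:]):
        if prev == 0:
            raise ValueError("growth rate is undefined when the previous level is zero")
        rates.append((curr - prev) / prev * 100.0)
    return rates


# Two hypothetical raw views in different units (illustrative numbers only).
cargo_volume = [100.0, 110.0, 121.0]
electricity_use = [2000.0, 2100.0, 2310.0]

# After mapping, both views live in the same space and can be compared.
unified_views = [to_growth_rate(cargo_volume), to_growth_rate(electricity_use)]
```

Once every view is expressed as a growth rate, the difference between two views for the same year is a plain subtraction.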

To evaluate the difference between any $u_i \in U$ and $u_j \in U$, we define a distance function $d(u_i, u_j)$ between them. By specifying the distance, we can quantify how different two views are, and further pick out the unreliable views. The formal definition of the distance function is as follows.

Definition 2. The distance function $d : U \times U \to \mathbb{R}^{+}$ uses a positive real number to represent the difference between any $u_i, u_j \in U$.

For simplicity, we use $u_i - u_j$ to represent $d(u_i, u_j)$ hereafter. In addition, note that $u_i + u_j$ and $|u_i|$ are equal to $d(u_i, -u_j)$ and $d(u_i, 0)$ respectively.

Given $U$, our objective is to determine which views are reliable or unreliable, and to estimate a truth view $u^*$ as the output. In detail, we are interested in quantifying the reliability of views and, based on it, averaging over weighted views. Let $u^g$ be the ground truth. The formal definition of $u^*$ is as follows.

Definition 3. The truth view is

$$u^* = \frac{\sum_{i=1}^{m} w_i u_i}{\sum_{i=1}^{m} w_i} \qquad (1)$$

where the weights $w_i$, $i = 1, 2, \ldots, m$, are assigned to the views respectively. Through the weight assignment, $u^*$ is supposed to be the estimated view that best approaches the ground truth $u^g$.
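The weighted averaging in Definition 3 can be sketched as follows. This is a minimal illustration that assumes scalar views (such as growth rates) and divides by the sum of the weights so that unnormalized weights are also accepted; the function name `truth_view` is hypothetical.

```python
from typing import Sequence


def truth_view(views: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of unified scalar views.

    Dividing by the weight sum means the weights need not be normalized
    beforehand (a convenience assumed for this sketch).
    """
    if len(views) != len(weights) or not views:
        raise ValueError("views and weights must be non-empty and of equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return sum(w * u for w, u in zip(weights, views)) / total


# Three illustrative views; the first one is trusted twice as much.
estimate = truth_view([7.0, 8.0, 10.0], [2.0, 1.0, 1.0])
```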


III. Overview 

The workflow of CIUV is shown in Figure 1. It includes 3 stages.

• 1st stage: CIUV is interested in finding the truth $u^*$, but it does not intend to expose its purpose. So CIUV first asks a series of questions in addition to the question about $u^*$ as masks. The people then provide their views as answers to this series of questions. Next, CIUV determines the reliability of the views based on ground truths it already knows, or by checking the consistency of the answers a person provides.

• 2nd stage: CIUV quantifies the reliability of views, and assigns greater weights to more reliable views and smaller weights to unreliable views. Then CIUV averages over these weighted views to calculate a truth view $u^*$ as the output.

• 3rd stage: If CIUV is not satisfied with the result $u^*$, it may ask the people for their answers again, applying incentives or punishment. CIUV repeats this process until it obtains an acceptable result or the accuracy of the result can hardly be improved.


IV. CIUV Implementation 

We introduce CIUV following the sequence of its 3 stages. In the 1st stage, CIUV determines the reliability of the views provided by people; in the 2nd stage, CIUV calculates a truth $u^*$ by weighting views; in the 3rd stage, CIUV improves the accuracy of the calculated $u^*$ by means of incentives or punishment.


A. 1st stage 


In the 1st stage, CIUV aims to determine the reliability of views by asking a series of questions. Two types of questions should be considered. The first type consists of questions whose ground truth is already known. For this type, CIUV assesses reliability by deriving the mean error $\mu$ and variance $\sigma^2$ of the views provided by each person. Let $S$ denote the set of views corresponding to the series of questions. For the $i$th person, we have

$$\mu_i = \frac{1}{|S|} \sum_{s \in S} (s^g - s_i) \qquad (2)$$

$$\sigma_i^2 = \frac{1}{|S|} \sum_{s \in S} (s^g - s_i - \mu_i)^2 \qquad (3)$$

where, as defined in Section II, $s^g - s_i$ (shorthand for $d(s^g, s_i)$) is the distance between the ground truth $s^g$ and the view $s_i$ provided by the $i$th person. The second type consists of questions (including the one about $u^*$) whose ground truth $s^g$ is still unknown. In this case, CIUV could use the mean value of $s_i$, $i = 1, 2, \ldots, m$, or an average over historical reliability values, as the truth $s^*$, and then infer $\mu$ and $\sigma^2$.
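For questions with known ground truth, the reliability estimate of one person can be sketched as below. The sketch takes the per-question error as ground truth minus answer, as in the text, and assumes the population variance (dividing by the number of questions); the function name `error_stats` and the sample numbers are hypothetical.

```python
from typing import Sequence, Tuple


def error_stats(ground_truths: Sequence[float],
                answers: Sequence[float]) -> Tuple[float, float]:
    """Return (mean error, error variance) of one person's answers.

    Assumption: population variance, i.e. division by the number of questions.
    """
    if len(ground_truths) != len(answers) or not answers:
        raise ValueError("need one answer per question and at least one question")
    errors = [g - s for g, s in zip(ground_truths, answers)]
    n = len(errors)
    mu = sum(errors) / n
    var = sum((e - mu) ** 2 for e in errors) / n
    return mu, var


# Three mask questions with known ground truth (illustrative numbers only).
mu_i, var_i = error_stats([1.0, 2.0, 3.0], [0.5, 2.5, 2.0])
```

A person with both a small `mu_i` and a small `var_i` is treated as reliable in the later stages.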

CIUV assumes that people provide their views independently, i.e., they do not collude to provide the same biased view. Otherwise, we can divide the people into independent groups such that any two people belonging to different groups give unrelated views to CIUV. Since this is not the major concern of our paper, its discussion is omitted here. Under this assumption, we use a Gaussian distribution to describe the error (that is, the reliability) of views in CIUV. For the $i$th person, we have

$$e_i \sim G(\mu_i, \sigma_i^2) \qquad (4)$$






















B. 2nd stage 

Here we describe how to derive the truth $u^*$ in CIUV. The basic idea of our approach is that reliable views are provided by reliable people, namely those with both small $\mu$ and small $\sigma$. In other words, the truth $u^*$ should be close to the views that are more trustworthy. We apply the weighted averaging strategy (Equation (1)) to calculate $u^*$. The critical point is how to assign the weights $w_1, w_2, \ldots, w_m$ to the views $u_1, u_2, \ldots, u_m$.

With the assumption that people provide their views independently, the error $e^*$ of the truth $u^*$ follows a Gaussian distribution:

$$e^* \sim G\left( \frac{\sum_{i=1}^{m} w_i \mu_i}{\sum_{i=1}^{m} w_i},\ \frac{\sum_{i=1}^{m} w_i^2 \sigma_i^2}{\left( \sum_{i=1}^{m} w_i \right)^2} \right) \qquad (5)$$


Without loss of generality, we restrict $\sum_{i=1}^{m} w_i = 1$. Suppose that we have an error threshold value $e_T$. The objective of CIUV is to maximize the probability $P(|e^*| < e_T)$:

$$\max\ P(|e^*| < e_T) \quad \text{s.t.}\ \sum_{i=1}^{m} w_i = 1,\ w_i \geq 0. \qquad (6)$$


Let $\mu^* = \sum_{i=1}^{m} w_i \mu_i$ and $\sigma^{*2} = \sum_{i=1}^{m} w_i^2 \sigma_i^2$. In Figure 2, we have $\mu_1^* < \mu_2^*$ and $\sigma_1^* < \sigma_2^*$. As can be seen, under the Gaussian distribution $G(\mu_1^*, \sigma_1^{*2})$, $P(|e^*| < e_T)$ has a greater value than under the other two distributions. In other words, to maximize $P(|e^*| < e_T)$, CIUV should weight the views so as to obtain the smallest combination of $|\mu^*|$ and $\sigma^*$. Substituting the Gaussian probability density function into $P(|e^*| < e_T)$, we have

$$P(|e^*| < e_T) = \int_{-e_T}^{e_T} \frac{1}{\sigma^* \sqrt{2\pi}} \exp\left( -\frac{(x - \mu^*)^2}{2\sigma^{*2}} \right) dx \qquad (7)$$

Unfortunately, it can be verified that the objective of maximizing $P(|e^*| < e_T)$ with Equation (7) cannot be solved directly. As a consequence, we are unable to derive the smallest combination of $\mu^*$ and $\sigma^*$ straightforwardly.
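Although the maximization has no direct solution, the probability itself is easy to evaluate numerically for any given weight assignment. The sketch below does so with the standard library error function, assuming independent Gaussian errors as in the text; the function names `normal_cdf` and `prob_within` are hypothetical.

```python
import math
from typing import Sequence


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """Cumulative distribution function of a Gaussian with mean mu and std sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def prob_within(mus: Sequence[float], variances: Sequence[float],
                weights: Sequence[float], e_t: float) -> float:
    """P(|e*| < e_t) for the weighted average of independent Gaussian errors."""
    total = sum(weights)
    w = [wi / total for wi in weights]  # normalize so the weights sum to 1
    mu_star = sum(wi * mi for wi, mi in zip(w, mus))
    var_star = sum(wi * wi * vi for wi, vi in zip(w, variances))
    if var_star == 0.0:
        # Degenerate case: the error is a point mass at mu_star.
        return 1.0 if abs(mu_star) < e_t else 0.0
    sigma_star = math.sqrt(var_star)
    return normal_cdf(e_t, mu_star, sigma_star) - normal_cdf(-e_t, mu_star, sigma_star)


# Two views with opposite biases and equal weights (illustrative numbers only).
p = prob_within([1.0, -1.0], [1.0, 1.0], [0.5, 0.5], e_t=1.0)
```

Evaluating `prob_within` for candidate weight vectors gives a way to compare assignments even when no closed-form optimum is available.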

Instead, we apply an approximation by first dealing with $\mu^*$ and $\sigma^*$ separately. We assume $e_\mu^* \sim G(\mu^*, 0)$ and $e_\sigma^* \sim G(0, \sigma^{*2})$. Then the objective of CIUV (Equation (6)) turns into $\max P(|e_\mu^*| < e_T)$ and $\max P(|e_\sigma^*| < e_T)$ respectively. Both optimization problems are convex, so the best weight assignment gives a global minimum guarantee for $e_\mu^*$ and $e_\sigma^*$. For $e_\mu^*$, there are many feasible solutions that maximize $P(|e_\mu^*| < e_T)$. We therefore adopt a simple weight assignment based on the intuition that a view with $\mu_i \to 0$ should be assigned a greater weight, and a smaller weight otherwise:

$$w_{\mu,i} \propto \frac{1}{|\mu_i|} \qquad (8)$$

For $e_\sigma^*$, the closed-form solution is:

$$w_{\sigma,i} \propto \frac{1}{\sigma_i^2} \qquad (9)$$

We normalize both weight assignments under the constraints $\sum_{i=1}^{m} w_{\mu,i} = 1$ and $\sum_{i=1}^{m} w_{\sigma,i} = 1$ respectively. For $e_\mu^*$, the weight assignment is:

$$w_{\mu,i} = \frac{\prod_{k=1, k \neq i}^{m} |\mu_k|}{\sum_{j=1}^{m} \prod_{k=1, k \neq j}^{m} |\mu_k|} \qquad (10)$$

For $e_\sigma^*$, the weight assignment is:

$$w_{\sigma,i} = \frac{\prod_{k=1, k \neq i}^{m} \sigma_k^2}{\sum_{j=1}^{m} \prod_{k=1, k \neq j}^{m} \sigma_k^2} \qquad (11)$$

Fig. 2. An example on Gaussian distribution.

Based on the above weight assignments for $\max P(|e_\mu^*| < e_T)$ and $\max P(|e_\sigma^*| < e_T)$, we designate the weights for the problem of maximizing $P(|e^*| < e_T)$ by the following proportion:

$w_i \propto$ [right-hand side not recoverable from the source] (12)

Normalizing the weights under the constraint $\sum_{i=1}^{m} w_i = 1$, we have

$w_i =$ [right-hand side not recoverable from the source] (13)

CIUV adopts the weight assignment of Equation (13). Its advantage is that a view with either a large $|\mu_i|$ or a large $\sigma_i$ is assigned a small weight, which gives us more confidence in the accuracy of the final output view $u^*$.
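The weighting step can be sketched as follows. The mean-based weights are taken proportional to the reciprocal of the absolute mean error, and the variance-based weights proportional to the reciprocal of the variance (the standard minimizer of the variance of a weighted average under a sum-to-one constraint). The combination rule is an assumption: the sketch multiplies the two normalized weights and renormalizes, which reproduces the stated property that a large mean error or a large variance leads to a small weight, but it may differ from the combination actually used by CIUV. The small constant `eps`, which guards against division by zero, and all function names are also hypothetical.

```python
from typing import List, Sequence


def normalize(raw: Sequence[float]) -> List[float]:
    """Scale non-negative raw weights so that they sum to 1."""
    total = sum(raw)
    return [r / total for r in raw]


def mean_weights(mus: Sequence[float], eps: float = 1e-9) -> List[float]:
    """Weights proportional to 1 / |mu_i|."""
    return normalize([1.0 / max(abs(m), eps) for m in mus])


def variance_weights(variances: Sequence[float], eps: float = 1e-9) -> List[float]:
    """Weights proportional to 1 / sigma_i^2 (inverse-variance weighting)."""
    return normalize([1.0 / max(v, eps) for v in variances])


def combined_weights(mus: Sequence[float], variances: Sequence[float],
                     eps: float = 1e-9) -> List[float]:
    """ASSUMPTION: combine both criteria by an elementwise product, then renormalize."""
    w_mu = mean_weights(mus, eps)
    w_sigma = variance_weights(variances, eps)
    return normalize([a * b for a, b in zip(w_mu, w_sigma)])


# The second view is worse on both criteria and receives a much smaller weight.
w = combined_weights([1.0, 2.0], [1.0, 4.0])
```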


C. 3rd stage 

CIUV improves the accuracy of the output view $u^*$ by means of incentives or punishment. In its implementation, CIUV is agnostic to the details of the adopted means. Instead, it assumes that the adopted means are effective, and focuses on the process of accuracy improvement. Here, effective means imply that each iteration of the CIUV process gives more confidence in the accuracy of the output view $u^*$. The formal definition is as follows.

Definition 4 (Effective Means). With the means adopted, the $(j+1)$th iteration of the CIUV process has the maximum probability $P(|e^*(j+1)| < e_T)$ compared with previous iterations, such that $P(|e^*(j+1)| < e_T) > P(|e^*(j)| < e_T)$.

In other words, the adopted means are effective in that they motivate people to provide more trustworthy views.

At the beginning of the process, CIUV assigns an infinite error value to the views $u_1(0), u_2(0), \ldots, u_m(0)$, namely $e_1(0) = e_2(0) = \ldots = e_m(0) = \infty$, and sets two threshold values: the acceptable confidence probability $R$ on the error, i.e., $P(|e^*(j+1)| < e_T) > R$, and the lower bound of confidence improvement $D$, i.e., $P(|e^*(j+1)| < e_T) - P(|e^*(j)| < e_T) > D$. Then, at each iteration, CIUV verifies whether $R$ is reached or can hardly be reached (namely, the confidence improvement between two consecutive iterations is lower than $D$).

Assume that the CIUV process is in its $(j+1)$th iteration. CIUV first checks whether $P(|e^*(j+1)| < e_T) > R$ or $P(|e^*(j+1)| < e_T) - P(|e^*(j)| < e_T) < D$ is satisfied.
















TABLE I 

Views on GDP growth rate 


Algorithm 1: CIUV Process

Output: The output truth view $u^*$

Set $j = -1$, $e^*(0) = \infty$, $e_i(0) = \infty$, $i = 1, 2, \ldots, m$;
repeat
    Set $j \leftarrow j + 1$;
    Test the reliability of views, and use the Gaussian distribution $G(\mu_i, \sigma_i^2)$, $i = 1, 2, \ldots, m$, to describe the error of views;
    Assign weight $w_i$ (Equation (13)), $i = 1, 2, \ldots, m$, to views;
    Calculate the view $u^*(j+1)$ (Equation (1));
    if $P(|e^*(j+1)| < e_T) > R$ or $P(|e^*(j+1)| < e_T) - P(|e^*(j)| < e_T) < D$ then
        Set $u^* = u^*(j+1)$;
        return $u^*$;
    else
        Stimulate the $i$th person to provide a more trustworthy view if $P(|e_i(j+1)| < e_T) - P(|e_i(j)| < e_T) > D$, $i = 1, 2, \ldots, m$;
    end
until $u^*$ is returned;


If so, it outputs the final view $u^*(j+1)$ using the weight assignment $w_1, w_2, \ldots, w_m$ of Equation (13). Otherwise, it checks the confidence improvement of each view provided by the people separately. In detail, for the $i$th person, CIUV calculates the probability $P(|e_i(j+1)| < e_T)$, subject to $w_i = 1$ and $w_k = 0$, $k \neq i$, $k = 1, 2, \ldots, m$. If $P(|e_i(j+1)| < e_T) - P(|e_i(j)| < e_T) < D$, CIUV concludes that it is hard to motivate the $i$th person to provide more trustworthy views and stops applying the means to that person; otherwise CIUV continues to motivate this person by means of incentives or punishment in the next iteration.

D. Summary

Here we summarize the overall flow of the CIUV process in Algorithm 1. The time complexity of the CIUV process is linear in the total number $m$ of views provided by the people, i.e., $O(Cm)$, where $C$ depends on the number of iterations, the time complexity of testing the reliability of views, and the time complexity of calculating the truth view $u^*$.
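The iterative process can be sketched as a loop. This is an illustration under several assumptions, not the authors' implementation: the weight rule (reciprocal of absolute mean error times variance) is assumed, the callback `reask(i)` stands in for whatever incentive or punishment is applied and returns a new view together with re-estimated error statistics, and `max_iter` is an added safety cap. All names are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Stats = Tuple[float, float]  # (mean error mu_i, error variance sigma_i^2)


def prob_within(mu: float, var: float, e_t: float) -> float:
    """P(|e| < e_t) for e ~ G(mu, var); zero variance is treated as a point mass."""
    if var <= 0.0:
        return 1.0 if abs(mu) < e_t else 0.0
    s = math.sqrt(2.0 * var)
    return 0.5 * (math.erf((e_t - mu) / s) - math.erf((-e_t - mu) / s))


def assign_weights(stats: Sequence[Stats], eps: float = 1e-9) -> List[float]:
    """ASSUMPTION: weight proportional to 1 / (|mu_i| * sigma_i^2), normalized."""
    raw = [1.0 / (max(abs(mu), eps) * max(var, eps)) for mu, var in stats]
    total = sum(raw)
    return [r / total for r in raw]


def run_ciuv(views: Sequence[float], stats: Sequence[Stats],
             reask: Callable[[int], Tuple[float, Stats]],
             e_t: float, r: float, d: float, max_iter: int = 100) -> float:
    """Iterate until confidence exceeds r or improves by less than d."""
    views, stats = list(views), list(stats)
    active = set(range(len(views)))   # people still worth stimulating
    prev_p = 0.0                      # infinite initial error means zero confidence
    prev_pi = [0.0] * len(views)
    u_star = float("nan")
    for _ in range(max_iter):
        w = assign_weights(stats)
        u_star = sum(wi * ui for wi, ui in zip(w, views))
        mu_star = sum(wi * mu for wi, (mu, _) in zip(w, stats))
        var_star = sum(wi * wi * var for wi, (_, var) in zip(w, stats))
        p = prob_within(mu_star, var_star, e_t)
        if p > r or p - prev_p < d:
            return u_star
        prev_p = p
        for i in sorted(active):
            p_i = prob_within(stats[i][0], stats[i][1], e_t)
            if p_i - prev_pi[i] < d:
                active.discard(i)     # hard to improve further: stop the effort
            else:
                prev_pi[i] = p_i
                views[i], stats[i] = reask(i)
    return u_star
```

Each iteration touches every view a constant number of times, which is consistent with a cost that grows linearly in the number of views for a fixed number of iterations.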

V. Error Analysis 

Consider the worst case of the error $e^*$ on $u^*$. Let the ground truth view be $u^g = \sum_{i=1}^{m} w_i^g u_i$ (Equation (1)), subject to $\sum_{i=1}^{m} w_i^g = 1$, while $u^* = \sum_{i=1}^{m} w_i u_i$, subject to $\sum_{i=1}^{m} w_i = 1$. Without loss of generality, assume that in the worst case the distance $e^* = u^g - u^*$ reaches its maximal value. We have

$$u^g - u^* \leq \sum_{i=1}^{m} (w_i^g - w_i)\,|u_i| \qquad (14)$$

where $|u_i|$ is equal to $d(u_i, 0)$ as stated in Section II. Let $|u_{max}| = \max\{|u_1|, |u_2|, \ldots, |u_m|\}$. We have the following theorem.

Theorem 1. The view $u^*$ output by CIUV is within distance $|u_{max}|$ of the ground truth $u^g$ in the worst case.

Proof: For the $i$th view, we have either $w_i^g \geq w_i$ or $w_i^g < w_i$. Dropping the negative parts, $w_i^g - w_i < 0$, $i = 1, 2, \ldots, m$, we can rewrite Inequality (14) as:

$$u^g - u^* \leq \sum_{i=1,\, w_i^g \geq w_i}^{m} (w_i^g - w_i)\,|u_i| \leq |u_{max}| \sum_{i=1,\, w_i^g \geq w_i}^{m} (w_i^g - w_i) \qquad (15)$$

Considering $\sum_{i=1}^{m} w_i^g = 1$ and $\sum_{i=1}^{m} w_i = 1$, the remaining sum is at most 1, so we have $u^g - u^* \leq |u_{max}|$ at worst. ∎

Besides, we can easily see that $u^* = u^g$ without error at best.

TABLE I
Views on GDP growth rate

Views     Full Name                          Mean Error   Standard Deviation
FCE       Final Consumption Expenditure      2.4069       1.3391
GCF       Gross Capital Formation            3.8193       2.9389
NE        Net Exports                        33.6287      34.5794
GDP_EA    GDP by the Expenditure Approach    1.2462       0.9685
NPT       Net Production Tax                 3.9390       3.4461
WC        Worker Compensation                4.1153       5.3371
DFA       Depreciation of Fixed Assets       3.6984       2.3672
BB        Business Balance                   10.7253      14.3010
GDP_IA    GDP by the Income Approach         3.0893       3.6595
FI        GDP of the First Industry          4.8382       3.2961
SI        GDP of the Secondary Industry      1.6570       1.1663
TI        GDP of the Tertiary Industry       2.6926       1.9201
GDP_PA    GDP by the Productive Approach     0            0


TABLE II
Error with malicious view mv = 0 and manipulation factor mf = 1

Method      Mean Error   Standard Deviation
CIUV        0.9214       0.3435
Mean        3.2013       3.8494
Median      1.0206       1.1076
Voting      1.1277       1.1946
3-Sources   1.4111       1.5874


VI. Experiment 

In this section, we present experimental results on a GDP dataset under various scenarios. The results show that CIUV achieves higher accuracy than the other methods. We first introduce the experimental setting in Section VI-A, and then conduct experiments with varying malicious views, varying manipulation factors, and varying improvement factors in Sections VI-B, VI-C, and VI-D respectively.

A. Experimental Setting 

Here we describe the adopted GDP dataset, introduce the comparison methods, and define the impact factors of the experiments.

The GDP dataset. We choose the GDP statistics of China for 1994-2014. The GDP statistics contain 3 independent parts: statistics by the expenditure approach, the income approach, and the productive approach. These statistics are listed in Table I. The statistics are related as follows:

$$\begin{cases} GDP\_EA = FCE + GCF + NE \\ GDP\_IA = NPT + WC + DFA + BB \\ GDP\_PA = FI + SI + TI \end{cases} \qquad (16)$$

All of these statistics can serve as indicators of GDP growth. We take them as views of the GDP growth rate (from 1995 to 2014). Since we do not actually know the most trustworthy view, we choose GDP_PA as the ground truth. The mean and standard deviation of the error of the other views relative to GDP_PA are also listed in Table I. Note that the statistics of GDP_PA, FCE, GCF, and NE for 2014, and the statistics of GDP_IA, NPT, WC, DFA, and BB for 2004, 2008, and 2011-2014, are not given on the official site of the National Bureau of Statistics. We use the GDP growth rate of the previous year to fill these blanks. Besides, the GDP dataset has a continuous value space. There also exist categorical datasets with a discrete value space. In that case, we can encode the data as vectors to compute the difference between views.
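Filling a missing year with the previous year's growth rate can be sketched as below. The sketch assumes that a run of consecutive missing years repeats the last available value, which is one reading of the rule described above; the function name `fill_with_previous` and the sample numbers are hypothetical.

```python
from typing import List, Optional, Sequence


def fill_with_previous(rates: Sequence[Optional[float]]) -> List[float]:
    """Replace each missing rate (None) with the most recent available rate.

    Assumption: consecutive gaps all take the last value seen before the gap.
    """
    filled: List[float] = []
    last: Optional[float] = None
    for rate in rates:
        if rate is None:
            if last is None:
                raise ValueError("the first entry cannot be missing")
            rate = last
        filled.append(rate)
        last = rate
    return filled


# Illustrative series with one single gap and one two-year gap.
series = fill_with_previous([7.5, None, 8.0, None, None])
```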

The methods. We implement four other methods besides CIUV for comparison. Two of these methods are Mean and Median.





















TABLE III
Error comparison with varying malicious view mv

Setting    Method      Mean Error   Standard Deviation
mv = 3     CIUV        [0324]       [0174]
mv = 3     Mean        3.3702       4.1237
mv = 3     Median      1.9483       1.3649
mv = 3     Voting      1.9091       1.7143
mv = 3     3-Sources   2.4202       2.0739
mv = 6     CIUV        [217733]     [r4991]
mv = 6     Mean        3.7933       4.4065
mv = 6     Median      2.7392       1.5693
mv = 6     Voting      2.5698       1.9648
mv = 6     3-Sources   2.2801       2.8040
mv = 9     CIUV        2.2539       [rsiDO]
mv = 9     Mean        4.3051       4.6010
mv = 9     Median      3.2303       1.7488
mv = 9     Voting      2.9949       1.9031
mv = 9     3-Sources   3.3172       2.4766
mv = 12    CIUV        2.9077       [0306]
mv = 12    Mean        5.3167       5.4336
mv = 12    Median      4.3290       2.8743
mv = 12    Voting      3.0689       1.9747
mv = 12    3-Sources   4.4670       1.9202

Entries in square brackets are reproduced as extracted; their digits could not be recovered reliably.






(a) mv = 3 (b) mv = 6 (c) mv = 9 (d) mv = 12 


Fig. 3. Error comparison with varying malicious view mv 


• Mean: Use the mean value of the views $u_1, u_2, \ldots, u_m$ to estimate $u^*$;

• Median: Use the median value (when $m \% 2 = 1$) or the average of the two median values (when $m \% 2 = 0$) to estimate $u^*$.




In addition, many truth-finding methods can be found in the research literature. They are applied in different scenarios, and we cannot simply determine the best method among them. The basic ideas behind them fall mainly into two categories: finding the most likely value by operating on the data itself, and calculating the final result based on the reliability of sources. Thus we implement two methods, Voting and K-sources, related in spirit to the two categories respectively.

• Voting: Let any two views $u_i$ and $u_j$ vote their distance $d(u_i, u_j)$ to each other, and choose the view nearest to all the other views as $u^*$;

• K-sources: Based on prior information, calculate $u^*$ as the mean value of the $k$ most trustworthy views.

In our experiments, the distance $d(u_i, u_j)$ is equal to the difference in GDP growth rate between $u_i$ and $u_j$. For the K-sources method, we set $k = 3$, and let the prior information be the statistics of 10 randomly selected years by default. For the CIUV method, we likewise ask 10 randomly selected questions about each year's statistics.
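The four comparison methods can be sketched for scalar views as follows. Two details are assumptions of this sketch: Voting picks the view with the smallest total absolute distance to all views (ties resolved in favor of the earlier view), and K-sources ranks trustworthiness by a prior error score supplied per view, where a smaller score means more trustworthy. All function names are hypothetical.

```python
import statistics
from typing import Sequence


def mean_estimate(views: Sequence[float]) -> float:
    """Mean baseline: arithmetic mean of all views."""
    return statistics.fmean(views)


def median_estimate(views: Sequence[float]) -> float:
    """Median baseline: middle value, or the average of the two middle values."""
    return statistics.median(views)


def voting_estimate(views: Sequence[float]) -> float:
    """Voting baseline: the view nearest to all other views in total distance."""
    return min(views, key=lambda u: sum(abs(u - v) for v in views))


def k_sources_estimate(views: Sequence[float], prior_errors: Sequence[float],
                       k: int = 3) -> float:
    """K-sources baseline: mean of the k views with the smallest prior error.

    Assumption: prior_errors[i] is a non-negative score from prior information.
    """
    ranked = sorted(range(len(views)), key=lambda i: prior_errors[i])
    return statistics.fmean([views[i] for i in ranked[:k]])


# Illustrative views for one year; the last one is an outlier.
views = [1.0, 2.0, 3.0, 10.0]
prior = [0.5, 0.1, 0.2, 5.0]
estimates = (mean_estimate(views), median_estimate(views),
             voting_estimate(views), k_sources_estimate(views, prior))
```

On this toy input the outlier pulls the mean upward, while the other three estimates stay near the bulk of the views.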

The experimental factors. We conduct experiments on three varying factors:

• Malicious View mv: the number of views manipulated by malicious people to prevent others from finding the truth;

• Manipulation Factor mf: the factor by which the original view is multiplied, i.e., $u_i' = mf \cdot u_i$;

• Improvement Factor if: the degree of accuracy improvement in each iteration of the CIUV process, using the following equation

[only a fragment of this equation survives in the source: $\ldots\,|u_i(j) - u^g|$] (17)

Equation (17) is based on the intuition that the accuracy improvement brought by incentives or punishment is fastest initially and then slows down as the iterations proceed. In the experiments, we set $a = 0.1$. The comparison of the methods with malicious view mv = 0 and manipulation factor mf = 1 (namely, without malicious manipulation) is shown in Table II.
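The first two factors can be sketched as a small helper that picks mv views at random and multiplies each by mf. The function name `manipulate`, the optional `seed` parameter, and the sample numbers are hypothetical and serve only to make the example reproducible.

```python
import random
from typing import List, Optional, Sequence, Set, Tuple


def manipulate(views: Sequence[float], mv: int, mf: float,
               seed: Optional[int] = None) -> Tuple[List[float], Set[int]]:
    """Multiply mv randomly chosen views by the manipulation factor mf.

    Returns the manipulated views and the indices that were altered.
    """
    if not 0 <= mv <= len(views):
        raise ValueError("mv must be between 0 and the number of views")
    rng = random.Random(seed)
    chosen = set(rng.sample(range(len(views)), mv))
    altered = [mf * u if i in chosen else u for i, u in enumerate(views)]
    return altered, chosen


# Twelve illustrative views, three of which are inflated by 20 percent.
altered, chosen = manipulate([1.0] * 12, mv=3, mf=1.2, seed=0)
```

Setting mv = 0 or mf = 1 leaves the views unchanged, which corresponds to the case without malicious manipulation.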


B. Varying Malicious View 

Here the experiments are conducted with varying malicious view mv = 3, 6, 9, and 12. The manipulation factor mf is set to 1.2. In the experiments, the malicious views are randomly selected 10 times for each mv = 3, 6, 9, 12. The corresponding results on the average error from 1995 to 2014 are shown in Figure 3, and the comparison of the mean and standard deviation of the error is shown in Table III. From the comparison results, we can see that (1) the error grows as the number of malicious views increases; (2) CIUV performs best among these methods (with the minimal mean and standard deviation of error for all mv = 3, 6, 9, 12), since CIUV considers both the mean and the variance of the error in its design; (3) for the other methods, informally, the average errors are ordered as Voting < Median < 3-Sources < Mean. Besides, when changing the manipulation factor mf under malicious view mv = 3, 6, 9, 12, we obtain similar results.


C. Varying Manipulation Factor 

We experiment with varying manipulation factor mf = 1.4 and 1.6. (The case mf = 1.2 is covered in the previous subsection.) The number of malicious views mv is set to 6, with the malicious views randomly selected 10 times. The corresponding results on the average error from 1995 to 2014 are shown in Figure 4, and the comparison of the mean and standard deviation of the error is shown in Table IV. It can be seen that (1) the error grows more drastically when the manipulation factor increases than when the number of malicious views changes; (2) CIUV has stable performance with the least error compared with the other methods; (3) for the other















































TABLE IV
Error comparison with varying manipulation factor mf

Setting     Method      Mean Error   Standard Deviation
mf = 1.4    CIUV        3.5912       [177097]
mf = 1.4    Mean        4.9291       4.7829
mf = 1.4    Median      4.5498       2.1737
mf = 1.4    Voting      4.5198       2.6655
mf = 1.4    3-Sources   4.1925       2.0899
mf = 1.6    CIUV        [TmA]        [177746]
mf = 1.6    Mean        6.2661       5.0928
mf = 1.6    Median      6.3644       3.0368
mf = 1.6    Voting      6.7461       3.3210
mf = 1.6    3-Sources   5.9709       2.3558

Entries in square brackets are reproduced as extracted; their digits could not be recovered reliably.





Fig. 4. Error comparison with varying manipulation factor mf 


Fig. 5. Error comparison with varying improvement factor if 


methods, the average errors are ordered as 3-Sources < Median ≈ Voting < Mean.


D. Varying Improvement Factor 

We experiment with varying improvement factor if = 0.1, 0.2, 0.3, and 0.4. The manipulation factor mf is set to 1.6. Let the cost be the total number of times the means are applied to views; e.g., when malicious view mv = 3, the means are applied 3 times in each iteration. The results for malicious view mv = 3 and mv = 6 are shown in Figure 5(a) and Figure 5(b) respectively. The results show that (1) the error decreases faster when the improvement factor is smaller; (2) the cost almost doubles to reach the same level of stable error when changing mv = 3 to mv = 6. In addition, we obtain similar results under other parameter settings; Figure 5(a) for mv = 3 and Figure 5(b) for mv = 6 are shown as representatives.


VII. Related Work 

Finding the truth by resolving conflicts among multiple sources has been studied for many years. An early and common method is to average (that is, vote on) the conflicting values to calculate a truth. The main idea behind it is to select the view that appears most often or is most similar to many other views. However, this type of method suffers from large errors when there exist sources with low-quality views.

To deal with the problem of low-quality views, many methods have been proposed to find the truth based on heuristic clues, e.g., prior knowledge of facts, source dependency, and sensitivity and specificity. Usually, this type of method uses the clues to evaluate the reliability of sources and calculates a truth by weighting the sources.

Learning from crowds is another field related to our work. It infers true values from data labeled by a crowd. The methods proposed in this research field usually focus on specific application scenarios.


VIII. Conclusion and Future Work

We propose CIUV, which collaborates information to form a truth against unreliable views provided by people with malicious intent. CIUV consists of 3 stages that interactively mitigate the impact of unreliable views and calculate the truth by considering both the mean and the variance of the error. We theoretically analyze the error bound of CIUV. In the experiments, we verify the feasibility and efficiency of CIUV under varying impact factors.

In this paper, we assume that people provide their views independently. In the future, we will consider the case in which people collude to provide biased views.